The Complexity of Finding Minimal Voronoi Covers with Applications to Machine Learning
Abstract
Our goal in this paper is to examine the application of Voronoi diagrams, a fundamental concept of computational geometry, to the nearest neighbor algorithm used in machine learning. We consider the question “Given a planar polygonal tessellation T and an integer k, is there a set of k points whose Voronoi diagram contains every edge in T?” We show that this question is NP-hard. We encountered this problem while studying a learning model in which we seek the minimum-sized set of training examples needed to teach a given geometric concept to a nearest neighbor learning program. That is, given a concept that can be described by a planar tessellation, we seek to construct the smallest set of points whose Voronoi diagram is consistent with the given tessellation. In a sense, this question captures the difficulty of teaching the nearest neighbor algorithm a simple structure using a minimal number of examples. In addition, we consider the natural inverse to the problem of computing Voronoi diagrams: given a planar polygonal tessellation T, we describe an algorithm to find a set of points whose Voronoi diagram is T, if such a set exists.
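To make the decision question concrete, the following is a minimal sketch of the easy direction only: checking whether a given candidate point set covers a given list of tessellation edges. It is not the paper's algorithm and does not search for a smallest set, which the abstract states is NP-hard. The function names (`on_shared_boundary`, `edge_is_covered`, `is_voronoi_cover`), the tolerance `tol`, and the representation of T as a list of line segments are assumptions made for illustration, and the sites are assumed to be distinct. The check relies on the fact that the set of locations where two sites tie for nearest is convex (a line intersected with half-planes), so testing both endpoints of a segment is enough.

```python
from itertools import combinations
from math import dist, isclose


def on_shared_boundary(x, p, q, sites, tol=1e-9):
    """True if x is equidistant from p and q and no site is strictly closer."""
    d = dist(x, p)
    if not isclose(d, dist(x, q), abs_tol=tol):
        return False
    return all(dist(x, r) >= d - tol for r in sites)


def edge_is_covered(edge, sites, tol=1e-9):
    """True if the segment lies on the Voronoi boundary of some pair of sites.

    The boundary between two cells is convex, so it contains the whole
    segment exactly when it contains both endpoints.
    """
    a, b = edge
    return any(
        on_shared_boundary(a, p, q, sites, tol)
        and on_shared_boundary(b, p, q, sites, tol)
        for p, q in combinations(sites, 2)
    )


def is_voronoi_cover(edges, sites, tol=1e-9):
    """True if every tessellation edge lies in the Voronoi diagram of sites."""
    return all(edge_is_covered(e, sites, tol) for e in edges)


# Hypothetical example: one vertical edge on the line x = 1.
tessellation = [((1.0, -1.0), (1.0, 1.0))]
print(is_voronoi_cover(tessellation, [(0.0, 0.0), (2.0, 0.0)]))  # True
print(is_voronoi_cover(tessellation, [(0.0, 0.0), (2.0, 0.0), (1.0, 0.5)]))  # False
```

In the second call the extra site at (1.0, 0.5) is closer to part of the edge than either of the original sites, so the edge no longer separates two Voronoi cells. In nearest neighbor terms, a training example placed there would change the learned boundary.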
Journal: Comput. Geom.
Volume: 3
Publication date: 1993